# Voice Agent Interaction

Ultravox V0 3
MIT
Ultravox is a multimodal speech large language model based on Llama3.1-8B-Instruct and Whisper-small, capable of processing both speech and text inputs.
Audio-to-Text, Transformers, English
FriendliAI
20
1
Ultravox V0 4 1 Llama 3 3 70b
MIT
Ultravox is a multimodal speech large language model based on Llama3.3-70B-Instruct and whisper-large-v3-turbo, capable of processing both speech and text inputs.
Audio-to-Text, Transformers, Supports Multiple Languages
fixie-ai
26
10
Ultravox V0 4 1 Mistral Nemo
MIT
Ultravox is a multimodal model based on Mistral-Nemo and Whisper, capable of processing both speech and text inputs, suitable for tasks like voice agents and speech translation.
Audio-to-Text, Transformers, Supports Multiple Languages
fixie-ai
1,285
25
© 2025 AIbase